Matching isotopic distributions from metabolically labeled samples

Authors

  • Sean McIlwain
  • David Page
  • Edward L. Huttlin
  • Michael R. Sussman
Abstract

MOTIVATION: In recent years, stable isotopic labeling has become a standard approach for quantitative proteomic analyses. Among the many available isotopic labeling strategies, metabolic labeling is attractive for the excellent internal control it provides. However, analyzing data from metabolic labeling experiments can be complicated because the spacing between the labeled and unlabeled forms of each peptide depends on its sequence and therefore varies from analyte to analyte. As a result, one generally needs to know the sequence of a peptide to identify its matching isotopic distributions in an automated fashion. In some experimental situations it is necessary or desirable to match pairs of labeled and unlabeled peaks from peptides of unknown sequence. This article addresses this largely overlooked problem in the analysis of quantitative mass spectrometry data by presenting an algorithm that not only identifies isotopic distributions within a mass spectrum, but also annotates matches between natural abundance (light) isotopic distributions and their metabolically labeled counterparts. The algorithm works in two stages: first we annotate the isotopic peaks using a modified version of the IDM algorithm described last year; then we use a probabilistic classifier, supplemented by dynamic programming, to find the metabolically labeled matched isotopic pairs. Such a method is needed for high-throughput quantitative proteomic and metabolomic experiments measured via mass spectrometry.

RESULTS: The primary result of this article is that the dynamic programming approach performs well given perfect isotopic distribution annotations. With perfect annotations, our algorithm achieves a true positive rate of 99% and a false positive rate of 1%. When the isotopic distributions are annotated from 'expert' selected peaks, the same algorithm achieves a true positive rate of 77% and a false positive rate of 1%. Finally, when annotating from 'machine' selected peaks, which may contain noise, the dynamic programming algorithm gives a true positive rate of 36% and a false positive rate of 1%. These rates reflect the requirement that both the light and heavy isotopic distributions be annotated exactly: in our evaluations, a match is considered 'entirely incorrect' if it is missing even one peak or contains an extraneous peak. If we only require that the 'monoisotopic' peaks exist within the two matched distributions, our algorithm obtains a true positive rate of 45% and a false positive rate of 1% on the 'machine' selected data. Changes to the algorithm's scoring function and training example generation improve the 'monoisotopic' peak true positive rate to 65%, with a false positive rate of 2%. All results were obtained with 10-fold cross-validation on 41 mass spectra covering a mass-to-charge range of 800-4000 m/z. In total, 713 isotopic distributions and 255 matched isotopic pairs were hand-annotated for this study.

AVAILABILITY: Programs are available via http://www.cs.wisc.edu/~mcilwain/IDM/.
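The two evaluation criteria described in the results (exact annotation of both distributions versus presence of the 'monoisotopic' peaks only) can be illustrated with a minimal Python sketch. This is not the authors' code: the `Distribution` and `Pair` types, the use of integer peak indices, and the function names are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, List


@dataclass(frozen=True)
class Distribution:
    """Hypothetical annotated isotopic distribution."""
    peaks: FrozenSet[int]  # indices into the spectrum's peak list (assumed representation)
    mono: int              # index of the annotated 'monoisotopic' peak


@dataclass(frozen=True)
class Pair:
    """Hypothetical light/heavy matched pair."""
    light: Distribution
    heavy: Distribution


def exact_match(pred: Pair, truth: Pair) -> bool:
    # Strict criterion: a single missing or extraneous peak in either
    # distribution makes the whole match incorrect.
    return (pred.light.peaks == truth.light.peaks
            and pred.heavy.peaks == truth.heavy.peaks)


def mono_match(pred: Pair, truth: Pair) -> bool:
    # Relaxed criterion: only the annotated 'monoisotopic' peaks must
    # appear within the two predicted distributions.
    return (truth.light.mono in pred.light.peaks
            and truth.heavy.mono in pred.heavy.peaks)


def true_positive_rate(preds: List[Pair], truths: List[Pair],
                       criterion: Callable[[Pair, Pair], bool]) -> float:
    # Fraction of hand-annotated pairs recovered by at least one prediction.
    if not truths:
        return 0.0
    hits = sum(any(criterion(p, t) for p in preds) for t in truths)
    return hits / len(truths)


# Toy example: the prediction misses one light peak (index 2).
truth = Pair(Distribution(frozenset({0, 1, 2}), mono=0),
             Distribution(frozenset({5, 6, 7, 8}), mono=6))
pred = Pair(Distribution(frozenset({0, 1}), mono=0),
            Distribution(frozenset({5, 6, 7, 8}), mono=6))

strict_tpr = true_positive_rate([pred], [truth], exact_match)
relaxed_tpr = true_positive_rate([pred], [truth], mono_match)
```

In this toy case the prediction fails the strict criterion but passes the relaxed one, which is the kind of difference that separates the 36% and 45% true positive rates reported on the 'machine' selected data.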

Related resources

Method for estimating the isotopic distributions of metabolically labeled proteins by MALDI-TOFMS: application to NMR samples.

We have developed an efficient method of estimating metabolic incorporation of heavy isotopes into proteins, including those where a single amino acid carries the label. The protein is digested with trypsin, and the resulting peptide mixture is examined directly by MALDI-TOF mass spectrometry. Peptides are chosen for analysis if they contain one or more labeled atoms and also exhibit clearly se...

Quantification of Peptide m/z Distributions from 13C-Labeled Cultures with High-Resolution Mass Spectrometry

Isotopic labeling studies of primary metabolism frequently utilize GC/MS to quantify (13)C in protein-hydrolyzed amino acids. During processing some amino acids are degraded, which reduces the size of the measurement set. The advent of high-resolution mass spectrometers provides a tool to assess molecular masses of peptides with great precision and accuracy and computationally infer information...

Isotopic analysis of mineral phases to unravel the origin of altered volcanic rocks: an example from the Leucite Hills lamproites

Study of lamproites from Leucite Hills, Wyoming, indicates that the isotopic compositions of some specimens have been modified due to the alteration and/or the presence of secondary carbonate impurities within the whole rocks. Leachate test shows that while phlogopite lamproites are not affected by secondary processes, the transitional madupitic lamproites from Middle Table Mountain and on...

Split-field drift tube/mass spectrometry and isotopic labeling techniques for determination of single amino acid polymorphisms.

A combination of split-field drift tube/mass spectrometry and isotopic labeling techniques is evaluated as a means of identifying single amino acid polymorphisms (SAAPs) in proteins. The method is demonstrated using cytochrome c (equine and bovine) and hemoglobin (bovine and sheep). For these studies, proteins from different species are digested with trypsin, and the peptides are labeled at prim...

Sample-oriented Domain Adaptation for Image Classification

Image processing is a method to perform some operations on an image, in order to get an enhanced image or to extract some useful information from it. The conventional image processing algorithms cannot perform well in scenarios where the training images (source domain) that are used to learn the model have a different distribution with test images (target domain). Also, many real world applicat...

Journal title:

Volume 24, Issue

Pages -

Publication date: 2008